The eight runs below vary the overall fraud rate passed to `throw` and the fraud rate forced into the test split by `split_dataframe`:

| # | throw (fraud rate) | split (test fraud rate) | Note |
|---|---|---|---|
| 1 | 0.2 | 0.2 | df02 |
| 2 | 0.5 | 0.5 | df50 |
| 3 | 0.3 | 0.05 | |
| 4 | 0.3 | 0.005 | |
| 5 | 0.3 | 0.0005 | |
| 6 | 0.1 | 0.05 | |
| 7 | 0.1 | 0.005 | |
| 8 | 0.1 | 0.0005 | |

lst = [result1, result2, result3, result4, result5, result6, result7, result8]
pd.concat(lst)

| | accuracy_score | precision_score | recall_score | f1_score | roc_auc_score |
|---|---|---|---|---|---|
| 1 | 0.974359 | 0.946741 | 0.991674 | 0.968686 | 0.977246 |
| 2 | 0.906260 | 0.047538 | 0.933333 | 0.090468 | 0.919729 |
| 3 | 0.927905 | 0.394945 | 0.833333 | 0.535906 | 0.883106 |
| 4 | 0.921079 | 0.054217 | 0.900000 | 0.102273 | 0.910592 |
| 5 | 0.932234 | 0.004902 | 0.666667 | 0.009732 | 0.799517 |
| 6 | 0.972361 | 0.729977 | 0.708889 | 0.719278 | 0.847551 |
| 7 | 0.986513 | 0.231579 | 0.733333 | 0.352000 | 0.860559 |
| 8 | 0.986069 | 0.228956 | 0.755556 | 0.351421 | 0.871391 |
Imports
import os  # needed below for os.path.exists in edge_index_save
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import networkx as nx
import sklearn
import xgboost as xgb
# sklearn
from sklearn import model_selection # train/test split helpers
from sklearn import ensemble # RF, GBM
from sklearn import metrics
from sklearn.metrics import precision_score, recall_score, f1_score
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
# gnn
import torch
import torch.nn.functional as F
import torch_geometric
from torch_geometric.nn import GCNConv
# autogluon
from autogluon.tabular import TabularDataset, TabularPredictor

def throw(df, fraud_rate): # downsample normal transactions to hit the target fraud rate
    df1 = df[df['is_fraud'] == 1].copy()
    df0 = df[df['is_fraud'] == 0].copy()
    # fraction of normals to keep so that fraud makes up `fraud_rate` of the result
    df0_downsample = (len(df1) * (1 - fraud_rate)) / (len(df0) * fraud_rate)
    df0_down = df0.sample(frac=df0_downsample, random_state=42)
    df_p = pd.concat([df1, df0_down])
    return df_p
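As a quick sanity check (a minimal sketch, assuming `fraudTrain` has already been loaded as it is further down), the fraud share of the returned frame should match the requested rate:

```python
df_p = throw(fraudTrain, 0.2)
print(df_p['is_fraud'].mean())  # ≈ 0.2 by construction
```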
def split_dataframe(data_frame, test_fraud_rate, test_rate=0.3):
    n = len(data_frame)
    # separate fraud and normal transactions
    fraud_data = data_frame[data_frame['is_fraud'] == 1]
    normal_data = data_frame[data_frame['is_fraud'] == 0]
    # size of the test set and of its fraud portion
    test_samples = int(test_fraud_rate * (n * test_rate))
    remaining_test_samples = int(n * test_rate) - test_samples
    # sample test rows from each class without replacement
    test_fraud_data = fraud_data.sample(n=test_samples, replace=False)
    test_normal_data = normal_data.sample(n=remaining_test_samples, replace=False)
    # assemble the test set
    test_data = pd.concat([test_normal_data, test_fraud_data])
    # everything not sampled into the test set becomes training data
    train_data = data_frame[~data_frame.index.isin(test_data.index)]
    return train_data, test_data
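A similar check for `split_dataframe` (a sketch mirroring run 3 above; the input must first go through `throw`, since the raw data has too few fraud rows to fill the requested test fraud share):

```python
df = throw(fraudTrain, 0.3)              # ~30% fraud overall
train, test = split_dataframe(df, 0.05)  # test set forced to ~5% fraud
print(len(test) / len(df))               # ≈ 0.3, the default test_rate
print(test['is_fraud'].mean())           # ≈ 0.05, the requested test fraud rate
```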
def concat(df_tr, df_tst):
    df = pd.concat([df_tr, df_tst])
    # boolean masks so train/test rows can be recovered after concatenation
    # (this avoids relying on the possibly-duplicated index)
    train_mask = np.concatenate((np.full(len(df_tr), True), np.full(len(df_tst), False)))
    test_mask = np.concatenate((np.full(len(df_tr), False), np.full(len(df_tst), True)))
    mask = (train_mask, test_mask)
    return df, mask
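The masks are positional, so they pull the original train/test rows back out of the stacked frame (a sketch, assuming `df_tr` and `df_tst` come from `split_dataframe` above):

```python
df_all, (train_mask, test_mask) = concat(df_tr, df_tst)
assert train_mask.sum() == len(df_tr) and test_mask.sum() == len(df_tst)
X_tr = df_all[train_mask]   # boolean arrays index rows positionally here
X_tst = df_all[test_mask]
```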
def evaluation(y, yhat):
    metric_fns = [sklearn.metrics.accuracy_score,   # renamed from `metrics` to avoid shadowing the imported module
                  sklearn.metrics.precision_score,
                  sklearn.metrics.recall_score,
                  sklearn.metrics.f1_score,
                  sklearn.metrics.roc_auc_score]    # note: AUC is computed from hard labels here, not probabilities
    return pd.DataFrame({m.__name__: [m(y, yhat).round(6)] for m in metric_fns})
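For reference, `evaluation` on toy labels returns a one-row frame keyed by metric name (a minimal sketch):

```python
y    = pd.Series([0, 0, 1, 1])
yhat = pd.Series([0, 1, 1, 1])
evaluation(y, yhat)
#    accuracy_score  precision_score  recall_score  f1_score  roc_auc_score
# 0            0.75         0.666667           1.0       0.8           0.75
```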
def compute_time_difference(group):
    n = len(group)
    result = []
    # all n^2 ordered pairs within the group, including self-pairs (time difference 0)
    for i in range(n):
        for j in range(n):
            time_difference = abs((group.iloc[i].trans_date_trans_time - group.iloc[j].trans_date_trans_time).total_seconds())
            result.append([group.iloc[i].name, group.iloc[j].name, time_difference])
    return result
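On a two-row toy group the helper returns all four ordered pairs, self-pairs included (hypothetical data, just to show the shape of the output):

```python
toy = pd.DataFrame(
    {'trans_date_trans_time': pd.to_datetime(['2019-01-01 00:00', '2019-01-01 00:10'])},
    index=[7, 8])
compute_time_difference(toy)
# [[7, 7, 0.0], [7, 8, 600.0], [8, 7, 600.0], [8, 8, 0.0]]
```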
def edge_index_save(df, unique_col, theta, gamma, save_attempt=0):
    # save_attempt is taken as a parameter here: the original referenced self.save_attempt,
    # which does not exist in a free function
    groups = df.groupby(unique_col)
    edge_index = np.array([item for sublist in (compute_time_difference(group) for _, group in groups) for item in sublist])
    edge_index = edge_index.astype(np.float64)
    filename = f"edge_index_attempt{save_attempt}_{str(unique_col).replace(' ', '').replace('_', '')}.npy"
    while os.path.exists(filename):
        save_attempt += 1
        filename = f"edge_index_attempt{save_attempt}_{str(unique_col).replace(' ', '').replace('_', '')}.npy"
    np.save(filename, edge_index)
    # decay the time difference into a weight, zeroing out the exact-zero (self-pair) case
    edge_index[:, 2] = (np.exp(-edge_index[:, 2] / theta) != 1) * np.exp(-edge_index[:, 2] / theta)
    # keep only pairs whose weight exceeds gamma
    edge_index = torch.tensor([(int(row[0]), int(row[1])) for row in edge_index if row[2] > gamma], dtype=torch.long).t()
    return edge_index
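The weighting step turns each time gap Δt into w = exp(-Δt/θ) and keeps a pair as an edge only when w > γ; the `!= 1` factor zeroes out the Δt = 0 self-pairs. A small numeric sketch with hypothetical θ and γ:

```python
theta, gamma = 600.0, 0.3        # hypothetical values, not taken from the runs above
for dt in [0.0, 300.0, 1200.0]:
    w = np.exp(-dt / theta) * (np.exp(-dt / theta) != 1)
    print(dt, round(float(w), 3), bool(w > gamma))
# 0.0 0.0 False      (self-pair: dropped)
# 300.0 0.607 True   (recent pair: kept as an edge)
# 1200.0 0.135 False (old pair: below gamma, dropped)
```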
def edge_index(df, unique_col, theta, gamma):
    # same as edge_index_save, but without writing the array to disk
    groups = df.groupby(unique_col)
    edge_index = np.array([item for sublist in (compute_time_difference(group) for _, group in groups) for item in sublist])
    edge_index = edge_index.astype(np.float64)
    edge_index[:, 2] = (np.exp(-edge_index[:, 2] / theta) != 1) * np.exp(-edge_index[:, 2] / theta)
    edge_index = torch.tensor([(int(row[0]), int(row[1])) for row in edge_index if row[2] > gamma], dtype=torch.long).t()
    return edge_index

fraudTrain = pd.read_csv("~/Desktop/fraudTrain.csv").iloc[:, 1:]
fraudTrain = fraudTrain.assign(trans_date_trans_time=pd.to_datetime(fraudTrain.trans_date_trans_time))
fraudTrain

| | trans_date_trans_time | cc_num | merchant | category | amt | first | last | gender | street | city | ... | lat | long | city_pop | job | dob | trans_num | unix_time | merch_lat | merch_long | is_fraud |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2019-01-01 00:00:00 | 2.703190e+15 | fraud_Rippin, Kub and Mann | misc_net | 4.97 | Jennifer | Banks | F | 561 Perry Cove | Moravian Falls | ... | 36.0788 | -81.1781 | 3495 | Psychologist, counselling | 1988-03-09 | 0b242abb623afc578575680df30655b9 | 1325376018 | 36.011293 | -82.048315 | 0 |
| 1 | 2019-01-01 00:00:00 | 6.304230e+11 | fraud_Heller, Gutmann and Zieme | grocery_pos | 107.23 | Stephanie | Gill | F | 43039 Riley Greens Suite 393 | Orient | ... | 48.8878 | -118.2105 | 149 | Special educational needs teacher | 1978-06-21 | 1f76529f8574734946361c461b024d99 | 1325376044 | 49.159047 | -118.186462 | 0 |
| 2 | 2019-01-01 00:00:00 | 3.885950e+13 | fraud_Lind-Buckridge | entertainment | 220.11 | Edward | Sanchez | M | 594 White Dale Suite 530 | Malad City | ... | 42.1808 | -112.2620 | 4154 | Nature conservation officer | 1962-01-19 | a1a22d70485983eac12b5b88dad1cf95 | 1325376051 | 43.150704 | -112.154481 | 0 |
| 3 | 2019-01-01 00:01:00 | 3.534090e+15 | fraud_Kutch, Hermiston and Farrell | gas_transport | 45.00 | Jeremy | White | M | 9443 Cynthia Court Apt. 038 | Boulder | ... | 46.2306 | -112.1138 | 1939 | Patent attorney | 1967-01-12 | 6b849c168bdad6f867558c3793159a81 | 1325376076 | 47.034331 | -112.561071 | 0 |
| 4 | 2019-01-01 00:03:00 | 3.755340e+14 | fraud_Keeling-Crist | misc_pos | 41.96 | Tyler | Garcia | M | 408 Bradley Rest | Doe Hill | ... | 38.4207 | -79.4629 | 99 | Dance movement psychotherapist | 1986-03-28 | a41d7549acf90789359a9aa5346dcb46 | 1325376186 | 38.674999 | -78.632459 | 0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1048570 | 2020-03-10 16:07:00 | 6.011980e+15 | fraud_Fadel Inc | health_fitness | 77.00 | Haley | Wagner | F | 05561 Farrell Crescent | Annapolis | ... | 39.0305 | -76.5515 | 92106 | Accountant, chartered certified | 1943-05-28 | 45ecd198c65e81e597db22e8d2ef7361 | 1362931649 | 38.779464 | -76.317042 | 0 |
| 1048571 | 2020-03-10 16:07:00 | 4.839040e+15 | fraud_Cremin, Hamill and Reichel | misc_pos | 116.94 | Meredith | Campbell | F | 043 Hanson Turnpike | Hedrick | ... | 41.1826 | -92.3097 | 1583 | Geochemist | 1999-06-28 | c00ce51c6ebb7657474a77b9e0b51f34 | 1362931670 | 41.400318 | -92.726724 | 0 |
| 1048572 | 2020-03-10 16:08:00 | 5.718440e+11 | fraud_O'Connell, Botsford and Hand | home | 21.27 | Susan | Mills | F | 005 Cody Estates | Louisville | ... | 38.2507 | -85.7476 | 736284 | Engineering geologist | 1952-04-02 | 17c9dc8b2a6449ca2473726346e58e6c | 1362931711 | 37.293339 | -84.798122 | 0 |
| 1048573 | 2020-03-10 16:08:00 | 4.646850e+18 | fraud_Thompson-Gleason | health_fitness | 9.52 | Julia | Bell | F | 576 House Crossroad | West Sayville | ... | 40.7320 | -73.1000 | 4056 | Film/video editor | 1990-06-25 | 5ca650881b48a6a38754f841c23b77ab | 1362931718 | 39.773077 | -72.213209 | 0 |
| 1048574 | 2020-03-10 16:08:00 | 2.283740e+15 | fraud_Buckridge PLC | misc_pos | 6.81 | Shannon | Williams | F | 9345 Spencer Junctions Suite 183 | Alpharetta | ... | 34.0770 | -84.3033 | 165556 | Prison officer | 1997-12-27 | 8d0a575fe635bbde12f1a2bffc126731 | 1362931730 | 33.601468 | -83.891921 | 0 |
1048575 rows × 22 columns
Autogluon(df02)
fraudTrain = fraudTrain[["amt", "is_fraud"]]  # keep only the amount feature and the label

def auto(df, test_fraud_rate):
    df_tr, df_tst = split_dataframe(df, test_fraud_rate)
    tr = TabularDataset(df_tr)
    tst = TabularDataset(df_tst)
    predictr = TabularPredictor("is_fraud")
    predictr.fit(tr, presets='best_quality')
    y = tst.is_fraud
    yhat = predictr.predict(tst)
    result = evaluation(y, yhat)
    return result

df = throw(fraudTrain, 0.2)
df_tr, df_tst = split_dataframe(df, 0.2)  # split rate per run 1 in the table above
tr = TabularDataset(df_tr)
tst = TabularDataset(df_tst)
predictr = TabularPredictor("is_fraud")
predictr.fit(tr, presets='best_quality')
y = tst.is_fraud
yhat = predictr.predict(tst)
result1 = evaluation(y, yhat)

No path specified. Models will be saved in: "AutogluonModels/ag-20240126_034006/"
Presets specified: ['best_quality']
Stack configuration (auto_stack=True): num_stack_levels=0, num_bag_folds=8, num_bag_sets=1
Beginning AutoGluon training ...
AutoGluon will save models to "AutogluonModels/ag-20240126_034006/"
AutoGluon Version: 0.8.2
Python Version: 3.8.18
Operating System: Linux
Platform Machine: x86_64
Platform Version: #38~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Nov 2 18:01:13 UTC 2
Disk Space Avail: 607.91 GB / 982.82 GB (61.9%)
Train Data Rows: 7007
Train Data Columns: 21
Label Column: is_fraud
Preprocessing data ...
AutoGluon infers your prediction problem is: 'binary' (because only two unique label-values observed).
2 unique label values: [1, 0]
If 'binary' is not the correct problem_type, please manually specify the problem_type parameter during predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
Selected class <--> label mapping: class 1 = 1, class 0 = 0
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 21176.89 MB
Train Data (Original) Memory Usage: 5.95 MB (0.0% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Note: Converting 1 features to boolean dtype as they only contain 2 unique values.
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Fitting CategoryFeatureGenerator...
Fitting CategoryMemoryMinimizeFeatureGenerator...
Fitting DatetimeFeatureGenerator...
Fitting TextSpecialFeatureGenerator...
Fitting BinnedFeatureGenerator...
Fitting DropDuplicatesFeatureGenerator...
Fitting TextNgramFeatureGenerator...
Fitting CountVectorizer for text features: ['street']
CountVectorizer fit with vocabulary size = 2
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Stage 5 Generators:
Fitting DropDuplicatesFeatureGenerator...
Unused Original Features (Count: 1): ['trans_num']
These features were not used to generate any of the output features. Add a feature generator compatible with these features to utilize them.
Features can also be unused if they carry very little information, such as being categorical but having almost entirely unique values or being duplicates of other features.
These features do not need to be present at inference time.
('object', []) : 1 | ['trans_num']
Types of features in original data (raw dtype, special dtypes):
('datetime', []) : 1 | ['trans_date_trans_time']
('float', []) : 6 | ['cc_num', 'amt', 'lat', 'long', 'merch_lat', ...]
('int', []) : 3 | ['zip', 'city_pop', 'unix_time']
('object', []) : 8 | ['merchant', 'category', 'first', 'last', 'gender', ...]
('object', ['datetime_as_object']) : 1 | ['dob']
('object', ['text']) : 1 | ['street']
Types of features in processed data (raw dtype, special dtypes):
('category', []) : 7 | ['merchant', 'category', 'first', 'last', 'city', ...]
('category', ['text_as_category']) : 1 | ['street']
('float', []) : 6 | ['cc_num', 'amt', 'lat', 'long', 'merch_lat', ...]
('int', []) : 3 | ['zip', 'city_pop', 'unix_time']
('int', ['binned', 'text_special']) : 8 | ['street.char_count', 'street.word_count', 'street.capital_ratio', 'street.lower_ratio', 'street.digit_ratio', ...]
('int', ['bool']) : 1 | ['gender']
('int', ['datetime_as_int']) : 10 | ['trans_date_trans_time', 'trans_date_trans_time.year', 'trans_date_trans_time.month', 'trans_date_trans_time.day', 'trans_date_trans_time.dayofweek', ...]
('int', ['text_ngram']) : 1 | ['__nlp__.suite']
1.0s = Fit runtime
20 features in original data used to generate 37 features in processed data.
Train Data (Processed) Memory Usage: 1.25 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 1.01s ...
AutoGluon will gauge predictive performance using evaluation metric: 'accuracy'
To change this, specify the eval_metric parameter of Predictor()
User-specified model hyperparameters to be fit:
{
'NN_TORCH': {},
'GBM': [{'extra_trees': True, 'ag_args': {'name_suffix': 'XT'}}, {}, 'GBMLarge'],
'CAT': {},
'XGB': {},
'FASTAI': {},
'RF': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
'XT': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
'KNN': [{'weights': 'uniform', 'ag_args': {'name_suffix': 'Unif'}}, {'weights': 'distance', 'ag_args': {'name_suffix': 'Dist'}}],
}
Fitting 13 L1 models ...
Fitting model: KNeighborsUnif_BAG_L1 ...
Exception ignored on calling ctypes callback function: <function _ThreadpoolInfo._find_modules_with_dl_iterate_phdr.<locals>.match_module_callback at 0x7f36a79fe700>
Traceback (most recent call last):
File "/home/coco/anaconda3/envs/py38/lib/python3.8/site-packages/threadpoolctl.py", line 400, in match_module_callback
self._make_module_from_path(filepath)
File "/home/coco/anaconda3/envs/py38/lib/python3.8/site-packages/threadpoolctl.py", line 515, in _make_module_from_path
module = module_class(filepath, prefix, user_api, internal_api)
File "/home/coco/anaconda3/envs/py38/lib/python3.8/site-packages/threadpoolctl.py", line 606, in __init__
self.version = self.get_version()
File "/home/coco/anaconda3/envs/py38/lib/python3.8/site-packages/threadpoolctl.py", line 646, in get_version
config = get_config().split()
AttributeError: 'NoneType' object has no attribute 'split'
0.84 = Validation score (accuracy)
0.0s = Training runtime
0.05s = Validation runtime
Fitting model: KNeighborsDist_BAG_L1 ...
Exception ignored on calling ctypes callback function: <function _ThreadpoolInfo._find_modules_with_dl_iterate_phdr.<locals>.match_module_callback at 0x7f36a79fe700>
Traceback (most recent call last):
File "/home/coco/anaconda3/envs/py38/lib/python3.8/site-packages/threadpoolctl.py", line 400, in match_module_callback
self._make_module_from_path(filepath)
File "/home/coco/anaconda3/envs/py38/lib/python3.8/site-packages/threadpoolctl.py", line 515, in _make_module_from_path
module = module_class(filepath, prefix, user_api, internal_api)
File "/home/coco/anaconda3/envs/py38/lib/python3.8/site-packages/threadpoolctl.py", line 606, in __init__
self.version = self.get_version()
File "/home/coco/anaconda3/envs/py38/lib/python3.8/site-packages/threadpoolctl.py", line 646, in get_version
config = get_config().split()
AttributeError: 'NoneType' object has no attribute 'split'
0.8818 = Validation score (accuracy)
0.0s = Training runtime
0.04s = Validation runtime
Fitting model: LightGBMXT_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.9575 = Validation score (accuracy)
1.72s = Training runtime
0.12s = Validation runtime
Fitting model: LightGBM_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.9636 = Validation score (accuracy)
1.3s = Training runtime
0.05s = Validation runtime
Fitting model: RandomForestGini_BAG_L1 ...
0.9595 = Validation score (accuracy)
0.53s = Training runtime
0.15s = Validation runtime
Fitting model: RandomForestEntr_BAG_L1 ...
0.9593 = Validation score (accuracy)
0.47s = Training runtime
0.17s = Validation runtime
Fitting model: CatBoost_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.9873 = Validation score (accuracy)
10.94s = Training runtime
0.11s = Validation runtime
Fitting model: ExtraTreesGini_BAG_L1 ...
0.9602 = Validation score (accuracy)
0.31s = Training runtime
0.17s = Validation runtime
Fitting model: ExtraTreesEntr_BAG_L1 ...
0.9589 = Validation score (accuracy)
0.33s = Training runtime
0.18s = Validation runtime
Fitting model: NeuralNetFastAI_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.9235 = Validation score (accuracy)
11.81s = Training runtime
0.17s = Validation runtime
Fitting model: XGBoost_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.9747 = Validation score (accuracy)
4.35s = Training runtime
0.11s = Validation runtime
Fitting model: NeuralNetTorch_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.8938 = Validation score (accuracy)
31.61s = Training runtime
0.15s = Validation runtime
Fitting model: LightGBMLarge_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.959 = Validation score (accuracy)
4.27s = Training runtime
0.11s = Validation runtime
Fitting model: WeightedEnsemble_L2 ...
0.9873 = Validation score (accuracy)
1.58s = Training runtime
0.01s = Validation runtime
AutoGluon training complete, total runtime = 78.9s ... Best model: "WeightedEnsemble_L2"
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("AutogluonModels/ag-20240126_034006/")
Autogluon(df50)
df = throw(fraudTrain, 0.5)
df_tr, df_tst = split_dataframe(df, 0.5)  # split rate per run 2 in the table above
tr = TabularDataset(df_tr)
tst = TabularDataset(df_tst)
predictr = TabularPredictor("is_fraud")
predictr.fit(tr, presets='best_quality')
y = tst.is_fraud
yhat = predictr.predict(tst)
result2 = evaluation(y, yhat)

No path specified. Models will be saved in: "AutogluonModels/ag-20240126_034125/"
Presets specified: ['best_quality']
Stack configuration (auto_stack=True): num_stack_levels=0, num_bag_folds=8, num_bag_sets=1
Beginning AutoGluon training ...
AutoGluon will save models to "AutogluonModels/ag-20240126_034125/"
AutoGluon Version: 0.8.2
Python Version: 3.8.18
Operating System: Linux
Platform Machine: x86_64
Platform Version: #38~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Nov 2 18:01:13 UTC 2
Disk Space Avail: 607.65 GB / 982.82 GB (61.8%)
Train Data Rows: 7007
Train Data Columns: 21
Label Column: is_fraud
Preprocessing data ...
AutoGluon infers your prediction problem is: 'binary' (because only two unique label-values observed).
2 unique label values: [1, 0]
If 'binary' is not the correct problem_type, please manually specify the problem_type parameter during predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
Selected class <--> label mapping: class 1 = 1, class 0 = 0
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 21211.51 MB
Train Data (Original) Memory Usage: 5.95 MB (0.0% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Note: Converting 1 features to boolean dtype as they only contain 2 unique values.
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Fitting CategoryFeatureGenerator...
Fitting CategoryMemoryMinimizeFeatureGenerator...
Fitting DatetimeFeatureGenerator...
Fitting TextSpecialFeatureGenerator...
Fitting BinnedFeatureGenerator...
Fitting DropDuplicatesFeatureGenerator...
Fitting TextNgramFeatureGenerator...
Fitting CountVectorizer for text features: ['street']
CountVectorizer fit with vocabulary size = 2
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Stage 5 Generators:
Fitting DropDuplicatesFeatureGenerator...
Unused Original Features (Count: 1): ['trans_num']
These features were not used to generate any of the output features. Add a feature generator compatible with these features to utilize them.
Features can also be unused if they carry very little information, such as being categorical but having almost entirely unique values or being duplicates of other features.
These features do not need to be present at inference time.
('object', []) : 1 | ['trans_num']
Types of features in original data (raw dtype, special dtypes):
('datetime', []) : 1 | ['trans_date_trans_time']
('float', []) : 6 | ['cc_num', 'amt', 'lat', 'long', 'merch_lat', ...]
('int', []) : 3 | ['zip', 'city_pop', 'unix_time']
('object', []) : 8 | ['merchant', 'category', 'first', 'last', 'gender', ...]
('object', ['datetime_as_object']) : 1 | ['dob']
('object', ['text']) : 1 | ['street']
Types of features in processed data (raw dtype, special dtypes):
('category', []) : 7 | ['merchant', 'category', 'first', 'last', 'city', ...]
('category', ['text_as_category']) : 1 | ['street']
('float', []) : 6 | ['cc_num', 'amt', 'lat', 'long', 'merch_lat', ...]
('int', []) : 3 | ['zip', 'city_pop', 'unix_time']
('int', ['binned', 'text_special']) : 8 | ['street.char_count', 'street.word_count', 'street.capital_ratio', 'street.lower_ratio', 'street.digit_ratio', ...]
('int', ['bool']) : 1 | ['gender']
('int', ['datetime_as_int']) : 10 | ['trans_date_trans_time', 'trans_date_trans_time.year', 'trans_date_trans_time.month', 'trans_date_trans_time.day', 'trans_date_trans_time.dayofweek', ...]
('int', ['text_ngram']) : 1 | ['__nlp__.suite']
0.7s = Fit runtime
20 features in original data used to generate 37 features in processed data.
Train Data (Processed) Memory Usage: 1.25 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 0.76s ...
AutoGluon will gauge predictive performance using evaluation metric: 'accuracy'
To change this, specify the eval_metric parameter of Predictor()
User-specified model hyperparameters to be fit:
{
'NN_TORCH': {},
'GBM': [{'extra_trees': True, 'ag_args': {'name_suffix': 'XT'}}, {}, 'GBMLarge'],
'CAT': {},
'XGB': {},
'FASTAI': {},
'RF': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
'XT': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
'KNN': [{'weights': 'uniform', 'ag_args': {'name_suffix': 'Unif'}}, {'weights': 'distance', 'ag_args': {'name_suffix': 'Dist'}}],
}
Fitting 13 L1 models ...
Fitting model: KNeighborsUnif_BAG_L1 ...
Exception ignored on calling ctypes callback function: <function _ThreadpoolInfo._find_modules_with_dl_iterate_phdr.<locals>.match_module_callback at 0x7f368c155f70>
Traceback (most recent call last):
File "/home/coco/anaconda3/envs/py38/lib/python3.8/site-packages/threadpoolctl.py", line 400, in match_module_callback
self._make_module_from_path(filepath)
File "/home/coco/anaconda3/envs/py38/lib/python3.8/site-packages/threadpoolctl.py", line 515, in _make_module_from_path
module = module_class(filepath, prefix, user_api, internal_api)
File "/home/coco/anaconda3/envs/py38/lib/python3.8/site-packages/threadpoolctl.py", line 606, in __init__
self.version = self.get_version()
File "/home/coco/anaconda3/envs/py38/lib/python3.8/site-packages/threadpoolctl.py", line 646, in get_version
config = get_config().split()
AttributeError: 'NoneType' object has no attribute 'split'
0.84 = Validation score (accuracy)
0.0s = Training runtime
0.04s = Validation runtime
Fitting model: KNeighborsDist_BAG_L1 ...
Exception ignored on calling ctypes callback function: <function _ThreadpoolInfo._find_modules_with_dl_iterate_phdr.<locals>.match_module_callback at 0x7f368c155310>
Traceback (most recent call last):
File "/home/coco/anaconda3/envs/py38/lib/python3.8/site-packages/threadpoolctl.py", line 400, in match_module_callback
self._make_module_from_path(filepath)
File "/home/coco/anaconda3/envs/py38/lib/python3.8/site-packages/threadpoolctl.py", line 515, in _make_module_from_path
module = module_class(filepath, prefix, user_api, internal_api)
File "/home/coco/anaconda3/envs/py38/lib/python3.8/site-packages/threadpoolctl.py", line 606, in __init__
self.version = self.get_version()
File "/home/coco/anaconda3/envs/py38/lib/python3.8/site-packages/threadpoolctl.py", line 646, in get_version
config = get_config().split()
AttributeError: 'NoneType' object has no attribute 'split'
0.8818 = Validation score (accuracy)
0.0s = Training runtime
0.04s = Validation runtime
Fitting model: LightGBMXT_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.9575 = Validation score (accuracy)
1.65s = Training runtime
0.13s = Validation runtime
Fitting model: LightGBM_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.9636 = Validation score (accuracy)
1.51s = Training runtime
0.05s = Validation runtime
Fitting model: RandomForestGini_BAG_L1 ...
0.9595 = Validation score (accuracy)
0.47s = Training runtime
0.16s = Validation runtime
Fitting model: RandomForestEntr_BAG_L1 ...
0.9593 = Validation score (accuracy)
0.53s = Training runtime
0.16s = Validation runtime
Fitting model: CatBoost_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.9873 = Validation score (accuracy)
10.84s = Training runtime
0.1s = Validation runtime
Fitting model: ExtraTreesGini_BAG_L1 ...
0.9602 = Validation score (accuracy)
0.32s = Training runtime
0.18s = Validation runtime
Fitting model: ExtraTreesEntr_BAG_L1 ...
0.9589 = Validation score (accuracy)
0.34s = Training runtime
0.18s = Validation runtime
Fitting model: NeuralNetFastAI_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.9235 = Validation score (accuracy)
11.65s = Training runtime
0.16s = Validation runtime
Fitting model: XGBoost_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.9747 = Validation score (accuracy)
4.41s = Training runtime
0.11s = Validation runtime
Fitting model: NeuralNetTorch_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.8938 = Validation score (accuracy)
30.73s = Training runtime
0.14s = Validation runtime
Fitting model: LightGBMLarge_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.959 = Validation score (accuracy)
4.32s = Training runtime
0.11s = Validation runtime
Fitting model: WeightedEnsemble_L2 ...
0.9873 = Validation score (accuracy)
1.56s = Training runtime
0.01s = Validation runtime
AutoGluon training complete, total runtime = 77.6s ... Best model: "WeightedEnsemble_L2"
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("AutogluonModels/ag-20240126_034125/")
Autogluon(0.3 / 0.05)
df = throw(fraudTrain, 0.3)
result3 = auto(df, 0.05)

No path specified. Models will be saved in: "AutogluonModels/ag-20240126_050219/"
Presets specified: ['best_quality']
Stack configuration (auto_stack=True): num_stack_levels=0, num_bag_folds=8, num_bag_sets=1
Beginning AutoGluon training ...
AutoGluon will save models to "AutogluonModels/ag-20240126_050219/"
AutoGluon Version: 0.8.2
Python Version: 3.8.18
Operating System: Linux
Platform Machine: x86_64
Platform Version: #38~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Nov 2 18:01:13 UTC 2
Disk Space Avail: 606.47 GB / 982.82 GB (61.7%)
Train Data Rows: 14014
Train Data Columns: 1
Label Column: is_fraud
Preprocessing data ...
AutoGluon infers your prediction problem is: 'binary' (because only two unique label-values observed).
2 unique label values: [1, 0]
If 'binary' is not the correct problem_type, please manually specify the problem_type parameter during predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
Selected class <--> label mapping: class 1 = 1, class 0 = 0
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 21246.58 MB
Train Data (Original) Memory Usage: 0.11 MB (0.0% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Stage 5 Generators:
Fitting DropDuplicatesFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('float', []) : 1 | ['amt']
Types of features in processed data (raw dtype, special dtypes):
('float', []) : 1 | ['amt']
0.0s = Fit runtime
1 features in original data used to generate 1 features in processed data.
Train Data (Processed) Memory Usage: 0.11 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 0.03s ...
AutoGluon will gauge predictive performance using evaluation metric: 'accuracy'
To change this, specify the eval_metric parameter of Predictor()
User-specified model hyperparameters to be fit:
{
'NN_TORCH': {},
'GBM': [{'extra_trees': True, 'ag_args': {'name_suffix': 'XT'}}, {}, 'GBMLarge'],
'CAT': {},
'XGB': {},
'FASTAI': {},
'RF': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
'XT': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
'KNN': [{'weights': 'uniform', 'ag_args': {'name_suffix': 'Unif'}}, {'weights': 'distance', 'ag_args': {'name_suffix': 'Dist'}}],
}
Fitting 13 L1 models ...
Fitting model: KNeighborsUnif_BAG_L1 ...
0.8836 = Validation score (accuracy)
0.0s = Training runtime
0.01s = Validation runtime
Fitting model: KNeighborsDist_BAG_L1 ...
0.8731 = Validation score (accuracy)
0.0s = Training runtime
0.01s = Validation runtime
Fitting model: LightGBMXT_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.8884 = Validation score (accuracy)
0.63s = Training runtime
0.05s = Validation runtime
Fitting model: LightGBM_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.8961 = Validation score (accuracy)
0.65s = Training runtime
0.02s = Validation runtime
Fitting model: RandomForestGini_BAG_L1 ...
0.8669 = Validation score (accuracy)
0.43s = Training runtime
0.25s = Validation runtime
Fitting model: RandomForestEntr_BAG_L1 ...
0.8669 = Validation score (accuracy)
0.6s = Training runtime
0.25s = Validation runtime
Fitting model: CatBoost_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.8964 = Validation score (accuracy)
1.72s = Training runtime
0.01s = Validation runtime
Fitting model: ExtraTreesGini_BAG_L1 ...
0.8714 = Validation score (accuracy)
0.34s = Training runtime
0.28s = Validation runtime
Fitting model: ExtraTreesEntr_BAG_L1 ...
0.8713 = Validation score (accuracy)
0.35s = Training runtime
0.28s = Validation runtime
Fitting model: NeuralNetFastAI_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.8827 = Validation score (accuracy)
12.2s = Training runtime
0.12s = Validation runtime
Fitting model: XGBoost_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.8958 = Validation score (accuracy)
0.53s = Training runtime
0.02s = Validation runtime
Fitting model: NeuralNetTorch_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.8948 = Validation score (accuracy)
18.45s = Training runtime
0.06s = Validation runtime
Fitting model: LightGBMLarge_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.8957 = Validation score (accuracy)
0.98s = Training runtime
0.02s = Validation runtime
Fitting model: WeightedEnsemble_L2 ...
0.897 = Validation score (accuracy)
2.89s = Training runtime
0.02s = Validation runtime
AutoGluon training complete, total runtime = 48.89s ... Best model: "WeightedEnsemble_L2"
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("AutogluonModels/ag-20240126_050219/")
Autogluon(0.3 / 0.005)
df = throw(fraudTrain, 0.3)
result4 = auto(df, 0.005)

No path specified. Models will be saved in: "AutogluonModels/ag-20240126_051650/"
Presets specified: ['best_quality']
Stack configuration (auto_stack=True): num_stack_levels=0, num_bag_folds=8, num_bag_sets=1
Beginning AutoGluon training ...
AutoGluon will save models to "AutogluonModels/ag-20240126_051650/"
AutoGluon Version: 0.8.2
Python Version: 3.8.18
Operating System: Linux
Platform Machine: x86_64
Platform Version: #38~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Nov 2 18:01:13 UTC 2
Disk Space Avail: 604.41 GB / 982.82 GB (61.5%)
Train Data Rows: 14014
Train Data Columns: 1
Label Column: is_fraud
Preprocessing data ...
AutoGluon infers your prediction problem is: 'binary' (because only two unique label-values observed).
2 unique label values: [1, 0]
If 'binary' is not the correct problem_type, please manually specify the problem_type parameter during predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
Selected class <--> label mapping: class 1 = 1, class 0 = 0
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 22354.36 MB
Train Data (Original) Memory Usage: 0.11 MB (0.0% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Stage 5 Generators:
Fitting DropDuplicatesFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('float', []) : 1 | ['amt']
Types of features in processed data (raw dtype, special dtypes):
('float', []) : 1 | ['amt']
0.0s = Fit runtime
1 features in original data used to generate 1 features in processed data.
Train Data (Processed) Memory Usage: 0.11 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 0.02s ...
AutoGluon will gauge predictive performance using evaluation metric: 'accuracy'
To change this, specify the eval_metric parameter of Predictor()
User-specified model hyperparameters to be fit:
{
'NN_TORCH': {},
'GBM': [{'extra_trees': True, 'ag_args': {'name_suffix': 'XT'}}, {}, 'GBMLarge'],
'CAT': {},
'XGB': {},
'FASTAI': {},
'RF': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
'XT': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
'KNN': [{'weights': 'uniform', 'ag_args': {'name_suffix': 'Unif'}}, {'weights': 'distance', 'ag_args': {'name_suffix': 'Dist'}}],
}
Fitting 13 L1 models ...
Fitting model: KNeighborsUnif_BAG_L1 ...
0.8747 = Validation score (accuracy)
0.0s = Training runtime
0.01s = Validation runtime
Fitting model: KNeighborsDist_BAG_L1 ...
0.8688 = Validation score (accuracy)
0.0s = Training runtime
0.02s = Validation runtime
Fitting model: LightGBMXT_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.8874 = Validation score (accuracy)
0.92s = Training runtime
0.05s = Validation runtime
Fitting model: LightGBM_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.8931 = Validation score (accuracy)
0.79s = Training runtime
0.01s = Validation runtime
Fitting model: RandomForestGini_BAG_L1 ...
0.8576 = Validation score (accuracy)
0.47s = Training runtime
0.27s = Validation runtime
Fitting model: RandomForestEntr_BAG_L1 ...
0.8576 = Validation score (accuracy)
0.54s = Training runtime
0.25s = Validation runtime
Fitting model: CatBoost_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.8923 = Validation score (accuracy)
1.99s = Training runtime
0.01s = Validation runtime
Fitting model: ExtraTreesGini_BAG_L1 ...
0.8639 = Validation score (accuracy)
0.34s = Training runtime
0.3s = Validation runtime
Fitting model: ExtraTreesEntr_BAG_L1 ...
0.8638 = Validation score (accuracy)
0.39s = Training runtime
0.29s = Validation runtime
Fitting model: NeuralNetFastAI_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.879 = Validation score (accuracy)
12.46s = Training runtime
0.12s = Validation runtime
Fitting model: XGBoost_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.8933 = Validation score (accuracy)
0.6s = Training runtime
0.02s = Validation runtime
Fitting model: NeuralNetTorch_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.8901 = Validation score (accuracy)
20.35s = Training runtime
0.07s = Validation runtime
Fitting model: LightGBMLarge_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.8928 = Validation score (accuracy)
1.06s = Training runtime
0.02s = Validation runtime
Fitting model: WeightedEnsemble_L2 ...
0.8939 = Validation score (accuracy)
2.93s = Training runtime
0.02s = Validation runtime
AutoGluon training complete, total runtime = 51.69s ... Best model: "WeightedEnsemble_L2"
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("AutogluonModels/ag-20240126_051650/")
Autogluon(0.3 / 0.0005)
df = throw(fraudTrain, 0.3)
result5 = auto(df, 0.0005)

No path specified. Models will be saved in: "AutogluonModels/ag-20240126_051742/"
Presets specified: ['best_quality']
Stack configuration (auto_stack=True): num_stack_levels=0, num_bag_folds=8, num_bag_sets=1
Beginning AutoGluon training ...
AutoGluon will save models to "AutogluonModels/ag-20240126_051742/"
AutoGluon Version: 0.8.2
Python Version: 3.8.18
Operating System: Linux
Platform Machine: x86_64
Platform Version: #38~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Nov 2 18:01:13 UTC 2
Disk Space Avail: 604.11 GB / 982.82 GB (61.5%)
Train Data Rows: 14014
Train Data Columns: 1
Label Column: is_fraud
Preprocessing data ...
AutoGluon infers your prediction problem is: 'binary' (because only two unique label-values observed).
2 unique label values: [1, 0]
If 'binary' is not the correct problem_type, please manually specify the problem_type parameter during predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
Selected class <--> label mapping: class 1 = 1, class 0 = 0
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 22269.23 MB
Train Data (Original) Memory Usage: 0.11 MB (0.0% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Stage 5 Generators:
Fitting DropDuplicatesFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('float', []) : 1 | ['amt']
Types of features in processed data (raw dtype, special dtypes):
('float', []) : 1 | ['amt']
0.0s = Fit runtime
1 features in original data used to generate 1 features in processed data.
Train Data (Processed) Memory Usage: 0.11 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 0.02s ...
AutoGluon will gauge predictive performance using evaluation metric: 'accuracy'
To change this, specify the eval_metric parameter of Predictor()
User-specified model hyperparameters to be fit:
{
'NN_TORCH': {},
'GBM': [{'extra_trees': True, 'ag_args': {'name_suffix': 'XT'}}, {}, 'GBMLarge'],
'CAT': {},
'XGB': {},
'FASTAI': {},
'RF': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
'XT': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
'KNN': [{'weights': 'uniform', 'ag_args': {'name_suffix': 'Unif'}}, {'weights': 'distance', 'ag_args': {'name_suffix': 'Dist'}}],
}
Fitting 13 L1 models ...
Fitting model: KNeighborsUnif_BAG_L1 ...
0.8746 = Validation score (accuracy)
0.0s = Training runtime
0.01s = Validation runtime
Fitting model: KNeighborsDist_BAG_L1 ...
0.8656 = Validation score (accuracy)
0.0s = Training runtime
0.01s = Validation runtime
Fitting model: LightGBMXT_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.8898 = Validation score (accuracy)
0.8s = Training runtime
0.09s = Validation runtime
Fitting model: LightGBM_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.892 = Validation score (accuracy)
0.73s = Training runtime
0.01s = Validation runtime
Fitting model: RandomForestGini_BAG_L1 ...
0.8588 = Validation score (accuracy)
0.49s = Training runtime
0.26s = Validation runtime
Fitting model: RandomForestEntr_BAG_L1 ...
0.8588 = Validation score (accuracy)
0.58s = Training runtime
0.25s = Validation runtime
Fitting model: CatBoost_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.8931 = Validation score (accuracy)
1.71s = Training runtime
0.01s = Validation runtime
Fitting model: ExtraTreesGini_BAG_L1 ...
0.8638 = Validation score (accuracy)
0.35s = Training runtime
0.29s = Validation runtime
Fitting model: ExtraTreesEntr_BAG_L1 ...
0.8643 = Validation score (accuracy)
0.48s = Training runtime
0.35s = Validation runtime
Fitting model: NeuralNetFastAI_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.878 = Validation score (accuracy)
12.93s = Training runtime
0.13s = Validation runtime
Fitting model: XGBoost_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.893 = Validation score (accuracy)
0.5s = Training runtime
0.02s = Validation runtime
Fitting model: NeuralNetTorch_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.8905 = Validation score (accuracy)
20.83s = Training runtime
0.07s = Validation runtime
Fitting model: LightGBMLarge_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.8912 = Validation score (accuracy)
0.97s = Training runtime
0.01s = Validation runtime
Fitting model: WeightedEnsemble_L2 ...
0.8935 = Validation score (accuracy)
3.33s = Training runtime
0.02s = Validation runtime
AutoGluon training complete, total runtime = 52.68s ... Best model: "WeightedEnsemble_L2"
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("AutogluonModels/ag-20240126_051742/")
Autogluon(0.1 / 0.05)
df = throw(fraudTrain, 0.1)
result6 = auto(df, 0.05)

No path specified. Models will be saved in: "AutogluonModels/ag-20240126_051835/"
Presets specified: ['best_quality']
Stack configuration (auto_stack=True): num_stack_levels=0, num_bag_folds=8, num_bag_sets=1
Beginning AutoGluon training ...
AutoGluon will save models to "AutogluonModels/ag-20240126_051835/"
AutoGluon Version: 0.8.2
Python Version: 3.8.18
Operating System: Linux
Platform Machine: x86_64
Platform Version: #38~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Nov 2 18:01:13 UTC 2
Disk Space Avail: 603.80 GB / 982.82 GB (61.4%)
Train Data Rows: 42042
Train Data Columns: 1
Label Column: is_fraud
Preprocessing data ...
AutoGluon infers your prediction problem is: 'binary' (because only two unique label-values observed).
2 unique label values: [1, 0]
If 'binary' is not the correct problem_type, please manually specify the problem_type parameter during predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
Selected class <--> label mapping: class 1 = 1, class 0 = 0
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 22269.36 MB
Train Data (Original) Memory Usage: 0.34 MB (0.0% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Stage 5 Generators:
Fitting DropDuplicatesFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('float', []) : 1 | ['amt']
Types of features in processed data (raw dtype, special dtypes):
('float', []) : 1 | ['amt']
0.0s = Fit runtime
1 features in original data used to generate 1 features in processed data.
Train Data (Processed) Memory Usage: 0.34 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 0.03s ...
AutoGluon will gauge predictive performance using evaluation metric: 'accuracy'
To change this, specify the eval_metric parameter of Predictor()
User-specified model hyperparameters to be fit:
{
'NN_TORCH': {},
'GBM': [{'extra_trees': True, 'ag_args': {'name_suffix': 'XT'}}, {}, 'GBMLarge'],
'CAT': {},
'XGB': {},
'FASTAI': {},
'RF': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
'XT': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
'KNN': [{'weights': 'uniform', 'ag_args': {'name_suffix': 'Unif'}}, {'weights': 'distance', 'ag_args': {'name_suffix': 'Dist'}}],
}
Fitting 13 L1 models ...
Fitting model: KNeighborsUnif_BAG_L1 ...
0.9448 = Validation score (accuracy)
0.01s = Training runtime
0.04s = Validation runtime
Fitting model: KNeighborsDist_BAG_L1 ...
0.9397 = Validation score (accuracy)
0.01s = Training runtime
0.04s = Validation runtime
Fitting model: LightGBMXT_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.9454 = Validation score (accuracy)
0.6s = Training runtime
0.05s = Validation runtime
Fitting model: LightGBM_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.9527 = Validation score (accuracy)
0.58s = Training runtime
0.02s = Validation runtime
Fitting model: RandomForestGini_BAG_L1 ...
0.938 = Validation score (accuracy)
1.19s = Training runtime
0.68s = Validation runtime
Fitting model: RandomForestEntr_BAG_L1 ...
0.938 = Validation score (accuracy)
1.08s = Training runtime
0.68s = Validation runtime
Fitting model: CatBoost_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.9527 = Validation score (accuracy)
1.48s = Training runtime
0.01s = Validation runtime
Fitting model: ExtraTreesGini_BAG_L1 ...
0.9413 = Validation score (accuracy)
0.5s = Training runtime
0.83s = Validation runtime
Fitting model: ExtraTreesEntr_BAG_L1 ...
0.9414 = Validation score (accuracy)
0.45s = Training runtime
0.78s = Validation runtime
Fitting model: NeuralNetFastAI_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.9506 = Validation score (accuracy)
37.84s = Training runtime
0.31s = Validation runtime
Fitting model: XGBoost_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.9528 = Validation score (accuracy)
0.58s = Training runtime
0.02s = Validation runtime
Fitting model: NeuralNetTorch_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.9541 = Validation score (accuracy)
70.34s = Training runtime
0.2s = Validation runtime
Fitting model: LightGBMLarge_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.9527 = Validation score (accuracy)
1.0s = Training runtime
0.05s = Validation runtime
Fitting model: WeightedEnsemble_L2 ...
0.9542 = Validation score (accuracy)
7.95s = Training runtime
0.04s = Validation runtime
AutoGluon training complete, total runtime = 135.79s ... Best model: "WeightedEnsemble_L2"
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("AutogluonModels/ag-20240126_051835/")
Autogluon(0.1 / 0.005)
df = throw(fraudTrain, 0.1)
result7 = auto(df, 0.005)

No path specified. Models will be saved in: "AutogluonModels/ag-20240126_052052/"
Presets specified: ['best_quality']
Stack configuration (auto_stack=True): num_stack_levels=0, num_bag_folds=8, num_bag_sets=1
Beginning AutoGluon training ...
AutoGluon will save models to "AutogluonModels/ag-20240126_052052/"
AutoGluon Version: 0.8.2
Python Version: 3.8.18
Operating System: Linux
Platform Machine: x86_64
Platform Version: #38~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Nov 2 18:01:13 UTC 2
Disk Space Avail: 603.34 GB / 982.82 GB (61.4%)
Train Data Rows: 42042
Train Data Columns: 1
Label Column: is_fraud
Preprocessing data ...
AutoGluon infers your prediction problem is: 'binary' (because only two unique label-values observed).
2 unique label values: [1, 0]
If 'binary' is not the correct problem_type, please manually specify the problem_type parameter during predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
Selected class <--> label mapping: class 1 = 1, class 0 = 0
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 22163.4 MB
Train Data (Original) Memory Usage: 0.34 MB (0.0% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Stage 5 Generators:
Fitting DropDuplicatesFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('float', []) : 1 | ['amt']
Types of features in processed data (raw dtype, special dtypes):
('float', []) : 1 | ['amt']
0.0s = Fit runtime
1 features in original data used to generate 1 features in processed data.
Train Data (Processed) Memory Usage: 0.34 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 0.03s ...
AutoGluon will gauge predictive performance using evaluation metric: 'accuracy'
To change this, specify the eval_metric parameter of Predictor()
User-specified model hyperparameters to be fit:
{
'NN_TORCH': {},
'GBM': [{'extra_trees': True, 'ag_args': {'name_suffix': 'XT'}}, {}, 'GBMLarge'],
'CAT': {},
'XGB': {},
'FASTAI': {},
'RF': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
'XT': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
'KNN': [{'weights': 'uniform', 'ag_args': {'name_suffix': 'Unif'}}, {'weights': 'distance', 'ag_args': {'name_suffix': 'Dist'}}],
}
Fitting 13 L1 models ...
Fitting model: KNeighborsUnif_BAG_L1 ...
0.9378 = Validation score (accuracy)
0.01s = Training runtime
0.04s = Validation runtime
Fitting model: KNeighborsDist_BAG_L1 ...
0.9318 = Validation score (accuracy)
0.01s = Training runtime
0.04s = Validation runtime
Fitting model: LightGBMXT_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.9421 = Validation score (accuracy)
0.85s = Training runtime
0.07s = Validation runtime
Fitting model: LightGBM_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.9476 = Validation score (accuracy)
0.79s = Training runtime
0.03s = Validation runtime
Fitting model: RandomForestGini_BAG_L1 ...
0.9312 = Validation score (accuracy)
1.2s = Training runtime
0.7s = Validation runtime
Fitting model: RandomForestEntr_BAG_L1 ...
0.9312 = Validation score (accuracy)
1.11s = Training runtime
0.7s = Validation runtime
Fitting model: CatBoost_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.947 = Validation score (accuracy)
1.55s = Training runtime
0.01s = Validation runtime
Fitting model: ExtraTreesGini_BAG_L1 ...
0.9342 = Validation score (accuracy)
0.49s = Training runtime
0.85s = Validation runtime
Fitting model: ExtraTreesEntr_BAG_L1 ...
0.9341 = Validation score (accuracy)
0.48s = Training runtime
0.8s = Validation runtime
Fitting model: NeuralNetFastAI_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.9453 = Validation score (accuracy)
38.15s = Training runtime
0.43s = Validation runtime
Fitting model: XGBoost_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.9467 = Validation score (accuracy)
0.52s = Training runtime
0.02s = Validation runtime
Fitting model: NeuralNetTorch_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.9477 = Validation score (accuracy)
65.84s = Training runtime
0.18s = Validation runtime
Fitting model: LightGBMLarge_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.9476 = Validation score (accuracy)
0.93s = Training runtime
0.05s = Validation runtime
Fitting model: WeightedEnsemble_L2 ...
0.9485 = Validation score (accuracy)
7.95s = Training runtime
0.04s = Validation runtime
AutoGluon training complete, total runtime = 131.81s ... Best model: "WeightedEnsemble_L2"
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("AutogluonModels/ag-20240126_052052/")
Autogluon(0.1 / 0.0005)
df = throw(fraudTrain, 0.1)
result8 = auto(df, 0.0005)

No path specified. Models will be saved in: "AutogluonModels/ag-20240126_052304/"
Presets specified: ['best_quality']
Stack configuration (auto_stack=True): num_stack_levels=0, num_bag_folds=8, num_bag_sets=1
Beginning AutoGluon training ...
AutoGluon will save models to "AutogluonModels/ag-20240126_052304/"
AutoGluon Version: 0.8.2
Python Version: 3.8.18
Operating System: Linux
Platform Machine: x86_64
Platform Version: #38~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Nov 2 18:01:13 UTC 2
Disk Space Avail: 602.85 GB / 982.82 GB (61.3%)
Train Data Rows: 42042
Train Data Columns: 1
Label Column: is_fraud
Preprocessing data ...
AutoGluon infers your prediction problem is: 'binary' (because only two unique label-values observed).
2 unique label values: [1, 0]
If 'binary' is not the correct problem_type, please manually specify the problem_type parameter during predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
Selected class <--> label mapping: class 1 = 1, class 0 = 0
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 22155.39 MB
Train Data (Original) Memory Usage: 0.34 MB (0.0% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Stage 5 Generators:
Fitting DropDuplicatesFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('float', []) : 1 | ['amt']
Types of features in processed data (raw dtype, special dtypes):
('float', []) : 1 | ['amt']
0.0s = Fit runtime
1 features in original data used to generate 1 features in processed data.
Train Data (Processed) Memory Usage: 0.34 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 0.04s ...
AutoGluon will gauge predictive performance using evaluation metric: 'accuracy'
To change this, specify the eval_metric parameter of Predictor()
User-specified model hyperparameters to be fit:
{
'NN_TORCH': {},
'GBM': [{'extra_trees': True, 'ag_args': {'name_suffix': 'XT'}}, {}, 'GBMLarge'],
'CAT': {},
'XGB': {},
'FASTAI': {},
'RF': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
'XT': [{'criterion': 'gini', 'ag_args': {'name_suffix': 'Gini', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'entropy', 'ag_args': {'name_suffix': 'Entr', 'problem_types': ['binary', 'multiclass']}}, {'criterion': 'squared_error', 'ag_args': {'name_suffix': 'MSE', 'problem_types': ['regression', 'quantile']}}],
'KNN': [{'weights': 'uniform', 'ag_args': {'name_suffix': 'Unif'}}, {'weights': 'distance', 'ag_args': {'name_suffix': 'Dist'}}],
}
Fitting 13 L1 models ...
Fitting model: KNeighborsUnif_BAG_L1 ...
0.937 = Validation score (accuracy)
0.01s = Training runtime
0.04s = Validation runtime
Fitting model: KNeighborsDist_BAG_L1 ...
0.9317 = Validation score (accuracy)
0.01s = Training runtime
0.04s = Validation runtime
Fitting model: LightGBMXT_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.9419 = Validation score (accuracy)
0.92s = Training runtime
0.07s = Validation runtime
Fitting model: LightGBM_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.9469 = Validation score (accuracy)
0.65s = Training runtime
0.03s = Validation runtime
Fitting model: RandomForestGini_BAG_L1 ...
0.9308 = Validation score (accuracy)
1.23s = Training runtime
0.7s = Validation runtime
Fitting model: RandomForestEntr_BAG_L1 ...
0.9308 = Validation score (accuracy)
1.14s = Training runtime
0.7s = Validation runtime
Fitting model: CatBoost_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.9465 = Validation score (accuracy)
1.52s = Training runtime
0.01s = Validation runtime
Fitting model: ExtraTreesGini_BAG_L1 ...
0.9341 = Validation score (accuracy)
0.47s = Training runtime
0.82s = Validation runtime
Fitting model: ExtraTreesEntr_BAG_L1 ...
0.9341 = Validation score (accuracy)
0.45s = Training runtime
0.79s = Validation runtime
Fitting model: NeuralNetFastAI_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.9451 = Validation score (accuracy)
39.1s = Training runtime
0.32s = Validation runtime
Fitting model: XGBoost_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.9462 = Validation score (accuracy)
0.6s = Training runtime
0.02s = Validation runtime
Fitting model: NeuralNetTorch_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.948 = Validation score (accuracy)
71.08s = Training runtime
0.23s = Validation runtime
Fitting model: LightGBMLarge_BAG_L1 ...
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
0.9469 = Validation score (accuracy)
0.84s = Training runtime
0.05s = Validation runtime
Fitting model: WeightedEnsemble_L2 ...
0.9481 = Validation score (accuracy)
7.99s = Training runtime
0.04s = Validation runtime
AutoGluon training complete, total runtime = 138.21s ... Best model: "WeightedEnsemble_L2"
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("AutogluonModels/ag-20240126_052304/")